Generic Sentence Fusion Is An Ill-Defined Summarization Task
نویسندگان
چکیده
We report on a series of human evaluations of the task of sentence fusion. In this task, a human is given two sentences and asked to produce a single coherent sentence that contains only the important information from the original two. Thus, this is a highly constrained summarization task. Our investigations show that even at this restricted level, there is no measurable agreement between humans regarding what information should be considered important. We further investigate the ability of separate evaluators to assess summaries, and find similarly disturbing lack of agreement.
منابع مشابه
Query-based Sentence Fusion is Better Defined and Leads to More Preferred Results than Generic Sentence Fusion
We show that question-based sentence fusion is a better defined task than generic sentence fusion (Q-based fusions are shorter, display less variety in length, yield more identical results and have higher normalized Rouge scores). Moreover, we show that in a QA setting, participants strongly prefer Q-based fusions over generic ones, and have a preference for union over intersection fusions.
متن کاملThe Potential And Limitations Of Automatic Sentence Extraction For Summarization
In this paper we present an empirical study of the potential and limitation of sentence extraction in text summarization. Our results show that the single document generic summarization task as defined in DUC 2001 needs to be carefully refocused as reflected in the low inter-human agreement at 100-word 1 (0.40 score) and high upper bound at full text 2 (0.88) summaries. For 100-word summaries, ...
متن کاملSIMBA: An Extractive Multi-document Summarization System for Portuguese
This is a proposal for demonstration of simba in PROPOR 2012. simba is an extractive multi-document summarization system that aims at producing generic summaries guided by a compression rate defined by the user. It uses a double-clustering approach to find the relevant information in a set of texts. In addition, simba uses a sentence simplification procedure as a mean to ensure summary compress...
متن کاملTowards Constructing Sports News from Live Text Commentary
In this paper, we investigate the possibility to automatically generate sports news from live text commentary scripts. As a preliminary study, we treat this task as a special kind of document summarization based on sentence extraction. We formulate the task in a supervised learning to rank framework, utilizing both traditional sentence features for generic document summarization and novelly des...
متن کاملGenerating Summaries Using Sentence Compression and Statistical Measures
In this paper, we propose a compression based multi-document summarization technique by incorporating word bigram probability and word co-occurrence measure. First we implemented a graph based technique to achieve sentence compression and information fusion. In the second step, we use hand-crafted rule based syntactic constraint to prune our compressed sentences. Finally we use probabilistic me...
متن کامل